Discreteness and the origin of probability in quantum mechanics 
Attempts to derive the Born rule, either in the Many Worlds or Copenhagen interpretation, are
unsatisfactory for systems with only a finite number of degrees of freedom. In the case of Many 
Worlds this is a serious problem, since its goal is to account for apparent collapse phenomena, 
including the Born rule for probabilities, assuming only unitary evolution of the wavefunction. For
a finite number of degrees of freedom, observers on the vast majority of branches would not deduce
the Born rule. However, discreteness of the quantum state space, even if extremely tiny, may restore 
the validity of the usual arguments. 



INTRODUCTION: THE PROBLEM WITH PROBABILITY



Quantum mechanics exhibits an odd dichotomy in the time evolution of states. A quantum state undergoes deterministic, unitary evolution until a measurement causes probabilistic, non-unitary collapse. While many physicists do not feel that there is anything wrong with this standard Copenhagen picture, it seems less than economical to postulate two fundamental processes (unitary evolution and non-unitary measurement) if somehow one could suffice. Everett [1] proposed that unitary time evolution of a closed system is sufficient to account for the appearance of measurement collapse to observers inside the system (see also Hartle [2] and DeWitt and Graham), in what has now become known as the Many Worlds (MW) formulation of quantum mechanics.

The MW interpretation is regarded as extravagant, and hence implausible, by many (including at least one of the authors), because of the huge multiplicity of branches of the wavefunction, each of which is presumed to be as real as the others [5]. Before the anti-MW reader abandons this paper, we note that the discussion that follows applies also to the conventional Copenhagen interpretation, with measurement collapse, and may allow a derivation of probability in quantum mechanics from a weaker initial assumption, known as the certainty assumption, along the lines of Hartle (see also Farhi, Goldstone and Gutmann, and Coleman and Lesniewski). An attractive doctrine (preferred by one of the authors) is the minimalist view outlined by Hartle, insisting that physics should be done without ill-defined words and slogans such as "The other worlds are just as real." Our analysis could also be read within this post-Everett or decoherent histories approach.

We focus on the Born rule in quantum mechanics, and the extent to which it can be derived. The Born rule states that given an observable $A$ with spectrum $\lambda_i$ and eigenstates $|\psi_i\rangle$, the probability of $\lambda_i$ as the outcome of a measurement on state $|\psi\rangle$ is $P_i = |\langle\psi_i|\psi\rangle|^2$. It has been claimed by Everett, Hartle, and others that this rule arises as a consequence of the assumption of unitary evolution, but as we discuss below, the derivation is unsatisfactory for any system with only a finite number of degrees of freedom. (The Born rule in MW has also been the subject of recent discussions.)

In a recent paper [9] we speculated that quantum gravity and related considerations may imply that quantum state space is itself discrete. We will review our argument in the next section. Here we point out that one consequence of this discreteness in state space may be the emergence of the Born rule, even in the case when the number of degrees of freedom is finite.

The original derivation of the Born rule given by Everett [1], Hartle [2], and others is quite simple. Consider an ensemble of $N$ identically prepared states

$$\Psi = \underbrace{\psi \otimes \psi \otimes \cdots \otimes \psi}_{N} \tag{1}$$

and a sequence of outcomes $S = (s_1, s_2, \ldots, s_N)$ obtained from measurements on each of the states. The probability $P(S)$ of a given sequence, or class of sequences, calculated using the Born rule, is identical to the norm (magnitude) squared of the projection of $\Psi$ onto eigenstates with the eigenvalues $(s_1, s_2, \ldots, s_N)$, namely $|\langle s_1 s_2 \cdots s_N|\Psi\rangle|^2$. As Everett noted, it follows that an improbable sequence corresponds to a component of $\Psi$ (in the eigenstate basis) with small magnitude. In the formal limit $N \to \infty$, components of $\Psi$ which do not correspond to statistically typical sequences generated by the Born rule have zero magnitude (i.e. converge to the null vector), and therefore do not correspond to physical states. From the frequentist perspective on probability, then, the Born rule is a consequence of excluding zero norm states from the Hilbert space.

To further elucidate, consider a simple example using spin states. Let $\psi = c_+|+\rangle + c_-|-\rangle$, and define $p_\pm = |c_\pm|^2$. Then a sequence of measurement outcomes will be a string of $N$ signs, $S = (\pm, \pm, \ldots, \pm)$. If the sequence is generated by the Born rule, then in the limit of large $N$, the fraction of $(+)$ outcomes will be $p_+$ to very good approximation. Any other value for the fraction of $(+)$ outcomes has zero probability at infinite $N$. Correspondingly, the magnitude squared $|\langle s_1 s_2 \cdots s_N|\Psi\rangle|^2$ is zero for any state $\langle s_1 s_2 \cdots s_N|$ in which the fraction of outcomes $s_i$ equal to $(+)$ is not $p_+$.
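
As a bridging step that the text does not spell out, and assuming the ensemble state is the product of $N$ copies of the spin state $\psi$, the squared amplitude of a single outcome sequence depends only on the number $n$ of $(+)$ outcomes it contains. The symbol $n$ and the counting notation below are introduced here for illustration:

```latex
\[
  |\langle s_1 s_2 \cdots s_N | \Psi \rangle|^2
  = \prod_{a=1}^{N} |c_{s_a}|^2
  = p_+^{\,n}\, p_-^{\,N-n},
  \qquad n = \#\{a : s_a = +\}.
\]
% There are \binom{N}{n} distinct sequences with the same n, so their
% collective squared norm is
\[
  \binom{N}{n}\, p_+^{\,n}\, p_-^{\,N-n},
\]
% which for large N is concentrated around n = p_+ N.
```

Each individual sequence has a small but non-zero squared norm at any finite $N$; only the collective weight of sequences away from $n = p_+ N$ tends to zero.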

This can be generalized: if $\langle s_1 s_2 \cdots s_N|$ corresponds to a sequence $S = (s_1, s_2, \ldots, s_N)$ which is statistically atypical according to the Born rule, its overlap with $\Psi$ will vanish when $N \to \infty$. Everett referred to these branches of the wavefunction as "maverick worlds"; observers on these branches would not deduce the Born rule. Below, we will repeat this discussion for those readers who prefer a more standard Copenhagen interpretation to the MW interpretation.

We can define parameters characterizing the deviation of a maverick world from the central Born value. For example, in the spin example, we might consider $f_+$ to be the frequency of $(+)$ outcomes, so that $\delta = f_+ - p_+$ is the deviation parameter. Then any branch with non-zero $\delta$ will have vanishing norm in the large $N$ limit. When $N$ is strictly infinite all maverick worlds have zero norm. The remaining branches have outcomes $S$ which satisfy the Born rule in the frequentist sense.

The problem with this reasoning is of course that $N$ is never strictly infinite. In fact, given the finite size of the causal horizon of our universe and an ultraviolet cutoff on modes (e.g., from the Planck scale), we obtain a finite, although very large, upper limit on the number of outcomes $N$ which characterize any particular branch of the MW wavefunction. Without invoking something like the Born rule (a correspondence between probability and norm) there is no reason to exclude branches with small but non-zero norm. The problem is exacerbated by the fact that maverick worlds are generally far more numerous than non-maverick worlds. The MW wavefunction branches with each measurement, regardless of how small either of $|c_\pm|^2$ is. This leads to $2^N$ total branches after $N$ measurements. Even if, e.g., $|c_+|^2$ is much larger than $|c_-|^2$, both $(+)$ and $(-)$ outcomes will still occur at each branch, and the structure of the tree is independent of $c_\pm$ as long as neither is zero. The overwhelming majority of branches will have roughly equal numbers of $(+)$ and $(-)$ outcomes. Thus the multiplicity of maverick worlds is enormously larger than that of non-maverick worlds, although their collective magnitude is vanishingly small. Again, without assuming the Born rule, we have no a priori reason to exclude small (but non-zero) norm states.
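
The contrast between counting branches and weighing them by norm can be checked numerically for modest $N$. The following is a minimal sketch, not part of the original analysis: the function name `window_statistics` and the sample values of `p`, `N`, and `delta` are illustrative assumptions. It compares the fraction of the $2^N$ branches whose $(+)$ frequency lies in a window with the collective squared norm of those same branches.

```python
from math import comb

def window_statistics(N, p, center, delta):
    """Branches whose (+) frequency n/N lies within delta of `center`.

    Returns (fraction of the 2**N branches, their total squared norm),
    where a branch with n (+) outcomes has squared norm p**n * (1-p)**(N-n).
    Window edges are subject to ordinary floating-point rounding.
    """
    count = 0        # number of branches in the window (exact integer)
    norm_sq = 0.0    # collective squared norm of those branches
    for n in range(N + 1):
        if abs(n / N - center) <= delta:
            multiplicity = comb(N, n)
            count += multiplicity
            norm_sq += multiplicity * (p ** n) * ((1 - p) ** (N - n))
    return count / 2 ** N, norm_sq

p = 0.9  # illustrative value of |c_+|^2
for N in (20, 100, 400):
    near_born = window_statistics(N, p, center=p, delta=0.05)
    near_half = window_statistics(N, p, center=0.5, delta=0.05)
    print(N, near_born, near_half)
```

For these sample values, branches near $f = 1/2$ dominate the count while carrying negligible norm, and branches near $f = p$ carry most of the norm while being a negligible fraction of the count.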

Of course, a strict frequentist interpretation of probability requires an infinite sequence of outcomes. However, the use of probability by physicists is more Bayesian than frequentist: confronted with a finite sequence of outcomes, $S = (s_1, s_2, \ldots, s_N)$, our goal is to deduce a predictive model for subsequent outcomes. In this way, we deduce the Born rule based on the limited number of measurements thus far performed on quantum systems.

As mentioned, our discussion may be of interest even to those who do not accept MW, as it pertains to the origin of the Born rule within the Copenhagen, or measurement collapse, interpretation. In particular, it has been proposed by Hartle that the Born rule can be derived from the weaker certainty assumption, stating that when a measurement of an observable $A$ is performed on an eigenstate $|a\rangle$ of $A$, the value $a$ is obtained with certainty. Taking $A$ to be, for example, the frequency operator for $(+)$ outcomes, or any other statistical property, Hartle found that for $N$ infinite, $\Psi$ is an eigenstate of each of these statistical operators, with eigenvalues given by the Born rule.

The discussion parallels that in the MW interpretation. In the standard Copenhagen picture the state $\Psi$ is, in the eigenstate basis, a sum of $2^N$ terms, each term being in one-to-one correspondence with a MW branch or a universe. In the Copenhagen interpretation the outcomes $S$ result from measurements on an ensemble, whereas in MW they specify a particular branch or decoherent history of the wavefunction of the entire universe. The mathematics is the same in either picture: maverick terms collectively have a very small norm that approaches zero as $N$ approaches infinity.

This has the same weakness as the earlier MW argument. For any finite $N$, the state $\Psi$ is only approximately an eigenstate of the frequency operator. The certainty assumption does not specify the outcome of a measurement on an approximate eigenstate, and going further requires an assumption relating the norm of a state vector to the probability of a measurement outcome, which is essentially the Born rule.

DISCRETE STATE SPACE 

Consider normalized states $\Psi = \psi \otimes \cdots \otimes \psi$ and $\Psi' = \psi' \otimes \cdots \otimes \psi'$. Suppose that, due to fundamental discreteness, one cannot distinguish $\psi$ and $\psi'$ when $|\psi - \psi'| < \epsilon$. This implies that the direct product states cannot be distinguished when (assuming $\sqrt{N}\epsilon \ll 1$)

$$|\Psi - \Psi'| < \sqrt{N}\epsilon. \tag{2}$$

(We have assumed that $\langle\psi|\psi'\rangle$ is real, which would be the case if $\psi'$ resulted from rotating $\psi$ slightly on the Bloch sphere. Relative phases could lead to order $N\epsilon$ terms in Eq. (2), which allow an acceptable cutoff of maverick branches for even smaller discreteness scale $\epsilon$.) Motivated by this observation, we assume that any (maverick!) components of $\Psi$ with norm less than $\sqrt{N}\epsilon$ can be removed from the wavefunction.
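
The step from the single-copy bound to Eq. (2) can be sketched as follows, under the stated assumptions that both states are normalized and that $\langle\psi|\psi'\rangle$ is real. This short derivation is a reconstruction for the reader's convenience rather than a quotation from the original:

```latex
\begin{align*}
  |\psi - \psi'|^2 &= 2 - 2\langle\psi|\psi'\rangle
  \quad\Longrightarrow\quad
  \langle\psi|\psi'\rangle = 1 - \tfrac{1}{2}|\psi - \psi'|^2, \\
  |\Psi - \Psi'|^2 &= 2 - 2\langle\psi|\psi'\rangle^{N}
  = 2 - 2\left(1 - \tfrac{1}{2}|\psi - \psi'|^2\right)^{N}
  \approx N\,|\psi - \psi'|^2 < N\epsilon^2 ,
\end{align*}
% where the approximation keeps the leading term in N|\psi-\psi'|^2,
% which is valid when N\epsilon^2 \ll 1.
```

Taking the square root gives $|\Psi - \Psi'| < \sqrt{N}\epsilon$.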

We argued in Ref. [9] that quantum gravity suggests a discreteness scale of order $\epsilon \sim E$, where $E$ is the characteristic energy of the system described by $\psi$, in Planck units. Equivalently, $\epsilon \sim L^{-1}$, where $L$ is the characteristic size, or Compton wavelength, of the system. We can motivate this result by noting that quantum gravity seems to imply a minimal length of order the Planck length. A minimal length restricts our ability to distinguish two different orientations of an experimental apparatus, such as a Stern-Gerlach device for measuring the orientation of a spin. (Rotation of the device by an angle less than $L^{-1}$ does not displace any component by more than the Planck length.) Thus, the resulting ambiguity in the spin state even after an ideal measurement is at least of order the $\epsilon$ given above (see Fig. 1). There is no way to ensure that the ensemble states $\psi$ are identical to accuracy better than $\epsilon$. For example, each time we pass a spin through the Stern-Gerlach device to produce another $\psi$ there can be no guarantee that the Stern-Gerlach device remains in precisely the same orientation.

While some might consider fundamental discreteness of the space of quantum states (referred to in the earlier paper [9] as discrete Hilbert space) to be a radical notion, we find asserting its absolute continuity in the absence of any supporting experimental evidence to be perhaps just as speculative. Consider the case of spacetime: few would claim that spacetime must be absolutely continuous (in fact, most likely it is not); why should quantum state space be different?

It is worth emphasizing that the discreteness we propose has nothing to do with the dimensionality of state space. Rather, it has to do with whether the coefficients $c_i$ in an eigenstate expansion $|\psi\rangle = \sum_i c_i |\psi_i\rangle$ are continuous or can only take on a discrete set of values (see Fig. 1).

We have not specified the concrete realization of discreteness, other than to assume that states can be defined only modulo some fundamental uncertainty. There are many ways to define the evolution of a state in a discrete state space. One method would be to write the time evolution operator $e^{-iHt}$ as a product of discrete evolution operators $e^{-iH\Delta t}$ and apply this product of operators sequentially to the state, followed by the "snap to" rule ("snap to nearest lattice site"; see Fig. 1) after each step. This is equivalent to taking classical digital computer simulations literally. That is, by accepting the finite precision of the variable $\psi(x)$ in an ordinary computer program, one obtains a naive discretization of Hilbert space with the "snap to" rule implemented by simple numerical rounding. With limited numerical precision, branches of the wavefunction with very small norm are eventually discarded. This scheme leads to small violations of linear superposition, but only at the level of $\epsilon$.
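
The "evolve, then snap" scheme described above can be illustrated with a toy qubit. The sketch below is only one possible realization and rests on assumptions not made in the text: amplitudes are rounded on a square lattice of spacing `eps` in the complex plane, one evolution step is modeled by a real rotation, and all function names are hypothetical.

```python
import math

def snap(state, eps):
    """The 'snap to' rule: round each amplitude's real and imaginary part
    to the nearest multiple of eps (a square lattice; an assumption)."""
    return [complex(round(c.real / eps) * eps, round(c.imag / eps) * eps)
            for c in state]

def rotation(theta):
    """Real 2x2 rotation, standing in for one step exp(-i H dt)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def step(u, state):
    """Apply a 2x2 matrix to a two-component state."""
    return [u[0][0] * state[0] + u[0][1] * state[1],
            u[1][0] * state[0] + u[1][1] * state[1]]

def evolve(state, u, steps, eps):
    """Alternate one evolution step with one snap, `steps` times."""
    for _ in range(steps):
        state = snap(step(u, state), eps)
    return state

eps = 1e-3
# A component of magnitude 4e-4 < eps/2 is rounded away.
print(snap([complex(1.0, 0.0), complex(4e-4, 0.0)], eps))
# Steps much smaller than the lattice spacing never leave the starting site.
print(evolve([1.0, 0.0], rotation(1e-4), 1000, eps))
# Larger steps follow the continuous rotation up to accumulated rounding.
print(evolve([1.0, 0.0], rotation(0.05), 10, eps), math.cos(0.5), math.sin(0.5))
```

In this toy version the state is not renormalized after snapping, so the norm itself is only preserved up to rounding of order `eps`; a different lattice or renormalization rule would change the details but not the qualitative loss of very small components.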

Interestingly, for $\epsilon \sim L^{-1}$, the condition that discreteness have only a small effect, $\sqrt{N}\epsilon \ll 1$, leads to a condition on the number of degrees of freedom reminiscent of holography [13]:

$$N < L^2 \sim A, \tag{3}$$

where $A$ is the surface area of the region. This bound implies far fewer degrees of freedom than the usual extensive scaling $N \sim L^3$. It can be deduced as a constraint from gravitational collapse [14]. Excluding states from the Hilbert space of the $L^3$ volume which would have already caused gravitational collapse to a black hole, we find the stronger condition $N < A^{3/4} \sim L^{3/2}$.




FIG. 1: A possible discretization of the Bloch sphere (qubit
state space). Points on each disc (of size $\epsilon$) are identified.
Points between discs can be assigned to the nearest disc.



NO MAVERICK WORLDS 

Consider the spin example from the first section. Let $n = n_+ = f_+ N$ be the number of $(+)$ outcomes in the sequence $S$. We suppress the $+$ subscript in what follows. For $N \gg 1$, the function

$$P(n) = \binom{N}{n} p^n (1-p)^{N-n} \tag{4}$$

has a sharp maximum at $n = pN$ and rapidly decreases for $n$ sufficiently far from it. The maximum results from a competition between the combinatorial factor (multiplicity), which is peaked at $n = N/2$, and the product $p^n (1-p)^{N-n}$, which is peaked at either $n = 0$ or $n = N$, unless $p_-$ is extremely close to $p_+$. It follows that when calculating $P(n)$ for $n$ not too far from $pN$, we make a negligible error by assuming $n \gg 1$ and $N - n \gg 1$. The Stirling formula gives



$$P(fN) \approx [2\pi N f(1-f)]^{-1/2} \exp[-N\phi(f)], \tag{5}$$

where

$$\phi(f) = f \ln(f/p) + (1-f) \ln[(1-f)/(1-p)] \tag{6}$$

and $f = n/N$. For large $N$ this becomes sharply peaked. Expanding $\phi(f)$ around $f = p$, we find

$$P(fN) \approx [2\pi N p(1-p)]^{-1/2} \exp\left[-\frac{N(f-p)^2}{2p(1-p)}\right]. \tag{7}$$
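
The quality of the two approximations can be checked against the exact binomial weight of Eq. (4). The sketch below is illustrative only: the function names and the sample values of $N$, $p$, and $n$ are assumptions, and `lgamma` is used so that the binomial coefficient is evaluated in logarithms.

```python
from math import exp, lgamma, log, pi, sqrt

def binomial_exact(n, N, p):
    """Eq. (4), evaluated via log-gamma to avoid overflow."""
    log_comb = lgamma(N + 1) - lgamma(n + 1) - lgamma(N - n + 1)
    return exp(log_comb + n * log(p) + (N - n) * log(1 - p))

def phi(f, p):
    """Eq. (6); requires 0 < f < 1 and 0 < p < 1."""
    return f * log(f / p) + (1 - f) * log((1 - f) / (1 - p))

def stirling_form(f, N, p):
    """Eq. (5)."""
    return exp(-N * phi(f, p)) / sqrt(2 * pi * N * f * (1 - f))

def gaussian_form(f, N, p):
    """Eq. (7): quadratic expansion of phi around f = p."""
    return exp(-N * (f - p) ** 2 / (2 * p * (1 - p))) / sqrt(2 * pi * N * p * (1 - p))

N, p = 10_000, 0.3
for n in (3000, 3050, 3200):
    f = n / N
    print(n, binomial_exact(n, N, p), stirling_form(f, N, p), gaussian_form(f, N, p))
```

The Stirling form tracks the exact value closely wherever $n$ and $N - n$ are both large, while the Gaussian form is accurate near the peak and degrades in the far tails, where the cubic and higher terms of $\phi(f)$ matter.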



The collective magnitude squared of all maverick states $|\delta, N\rangle$ with frequency deviation $|\delta| = |f - p|$ greater than $\delta_0$ is

$$\sum_{|\delta| > \delta_0} P(n) \approx N \int_{|\delta| > \delta_0} df\, P(fN). \tag{8}$$

One contribution to the sum comes from the range $f \in [0, p - \delta_0]$ and the other from the range $f \in [p + \delta_0, 1]$. Note that we have replaced $f(1-f)$ in the overall factor in $P(fN)$ by $p(1-p)$. The resulting error should be negligible for our purposes here.

Requiring that this collective magnitude squared is less than $N\epsilon^2$ yields

$$\delta_0 > N^{-1/2} \left[2p(1-p)\,|\ln(N\epsilon^2)|\right]^{1/2}. \tag{9}$$

The maximum deviation $\delta$ for undiscarded branches vanishes as $N \to \infty$ for fixed $p$, $\epsilon$. If, for finite $N$, an experimenter could measure all $N$ outcomes which define his branch of the wavefunction, he might find a deviation from the predicted Born frequency $f = p$ as large as $|\ln(N\epsilon^2)|^{1/2}$ standard deviations (i.e., measuring the deviation in units of $N^{-1/2}$). Note that we are working in the regime $N\epsilon^2 \ll 1$. If the discussion in Ref. [9] offers a valid guide, the number $\epsilon$ may be much smaller than $10^{-20}$, so that even if $N$ is as large as Avogadro's number, $N\epsilon^2$ will still be a small number (see example below).
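
As a rough numerical check of the threshold in Eq. (9), one can evaluate the Gaussian tail of Eq. (7) at that threshold and compare it with $N\epsilon^2$. The sketch below assumes the integration range is extended to infinity, so the tail becomes a complementary error function; the function names and sample values are illustrative.

```python
from math import erfc, log, sqrt

def deviation_threshold(N, eps, p):
    """Right-hand side of Eq. (9)."""
    return sqrt(2.0 * p * (1.0 - p) * abs(log(N * eps ** 2)) / N)

def gaussian_tail(delta0, N, p):
    """N times the integral of the Gaussian form (7) over |f - p| > delta0,
    with the integration range extended to infinity (an approximation)."""
    return erfc(delta0 * sqrt(N / (2.0 * p * (1.0 - p))))

N, eps, p = 1e6, 1e-10, 0.3       # sample values with N * eps**2 << 1
delta0 = deviation_threshold(N, eps, p)
sigma = sqrt(p * (1 - p) / N)     # standard deviation of the frequency f
print(delta0, delta0 / sigma, gaussian_tail(delta0, N, p), N * eps ** 2)
```

With these definitions the threshold measured in standard deviations equals $[2|\ln(N\epsilon^2)|]^{1/2}$, which agrees with the estimate quoted in the text up to a factor of $\sqrt{2}$, and the tail at the threshold falls below $N\epsilon^2$.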

However, an experimenter is unlikely to be able to measure more than a small fraction of the outcomes that determine his branch. Recall that in MW a particular branch of the wavefunction is specified by the sequence of outcomes $S = (s_1, s_2, \ldots, s_N)$. $N$ is the total number of decoherent outcomes on a branch, so it is typically enormous: at least Avogadro's number if the system contains macroscopic objects such as an experimenter. The experimental outcomes available to test Born's rule will be a much smaller number $N_* \ll N$ corresponding to a subset of the $s_i$ directly related to the experiment. Any deviation from the Born rule of order $N^{-1/2}$ will be well within the experimental statistical error of order $N_*^{-1/2}$. Therefore the Born rule will be observed to hold in all the branches which remain after truncation due to discreteness. This would, however, not be true if we were to set $\epsilon$ to zero, in which case $|\ln(N\epsilon^2)|^{1/2}$ would be infinite.

For definiteness, consider the following numerical example. Let the discreteness scale be truly tiny: $\epsilon \sim 10^{-100}$, and let $N \sim 10^{160}$, which is the Hubble four-volume in fermis. Then $|\ln(N\epsilon^2)|^{1/2} \sim 10$, so unless experimenters can measure more than $10^{-2} N \sim 10^{158}$ quantum outcomes, they will have insufficient statistics to exclude any of the maverick branches which remain after truncation.
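
The arithmetic of this example is easy to reproduce. The sketch below works with base-10 exponents to stay clear of floating-point underflow; the function name is hypothetical, and the estimate $N_* \gtrsim N / |\ln(N\epsilon^2)|$ is our reading of the requirement that the statistical error $N_*^{-1/2}$ be smaller than a deviation of $|\ln(N\epsilon^2)|^{1/2}$ in units of $N^{-1/2}$.

```python
from math import log, log10, sqrt

def log_bound(log10_N, log10_eps):
    """|ln(N eps^2)|, computed from the base-10 exponents of N and eps."""
    return abs((log10_N + 2 * log10_eps) * log(10.0))

log10_N, log10_eps = 160, -100
L = log_bound(log10_N, log10_eps)       # |ln(10^-40)|
max_sigma = sqrt(L)                     # deviation in units of N^(-1/2)
log10_N_star = log10_N - log10(L)       # exponent of N / |ln(N eps^2)|
print(max_sigma, log10_N_star)
```

The first number comes out slightly below 10 and the second slightly above 158, consistent with the order-of-magnitude figures quoted in the text.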

COPENHAGEN AGAIN 

If we assume the Copenhagen (collapse) interpretation, our analysis describes when the Born rule can be supplanted by the weaker assumption of certainty of measurement outcome when the measured state is an eigenstate. In a discrete Hilbert space it is natural to extend the notion of eigenstate, so that states within the discreteness distance $\epsilon$ of an eigenstate will also be considered eigenstates. (More precisely, we cannot distinguish between any two such states.) As discussed in the previous section, for large (but finite) $N$, $\Psi$ is approximately an eigenstate of any statistical operator (such as the frequency operator, but also higher moments) with eigenvalue equal to the Born rule value. For example, the wavefunction is sharply peaked at the Born rule frequency value of $f = p$. If, motivated by the discreteness scale $\epsilon$, we simply modify the certainty assumption to include states which are approximate eigenstates, we will have deduced the Born rule from a more elementary assumption.

There is, however, a technical difficulty in defining how close a state $\Psi$ is to being an eigenstate of an operator such as the frequency operator. It would be natural to impose a certainty criterion as follows. Given $\Psi$ satisfying

$$|\Psi - \Psi_f| < \sqrt{N}\epsilon, \tag{10}$$

where $\Psi_f$ is an eigenstate of the frequency operator with eigenvalue $f$, we identify $\Psi$ with $\Psi_f$ and require that a measurement of the frequency on $\Psi$ return the value $f$ with certainty. The problem arises because, for finite $N$, no choice of $\Psi = \bigotimes_{a=1}^{N} \psi$ is an exact eigenstate of the frequency operator (except in the trivial cases where $\psi$ is already an eigenstate such as $|+\rangle$ or $|-\rangle$, and in those cases $f$ is either zero or one). The state $\Psi_f$ does not exist, except in the limit $N \to \infty$, so the distance criterion in Eq. (10) cannot be defined. ($\Psi$ and $\Psi_f$ live in Hilbert spaces of very different dimensions.) One has to rely on some other criterion for identifying a state $\Psi$ as a frequency eigenstate.

One possibility is to use the width of $|\Psi|^2$ about the maximum, in comparison to some $\epsilon$-dependent quantity. When the width is sufficiently small, the certainty assumption is assumed to apply. Consider a self-adjoint operator $A$, its eigenvectors $\psi_i$ and eigenvalues $\lambda_i$, $A\psi_i = \lambda_i\psi_i$, $\langle\psi_i|\psi_j\rangle = \delta_{ij}$ ($i = 1, \ldots, n$). (For the qubit case $A$ is the spin operator and $n = 2$.) For a state $\psi = \sum_i c_i\psi_i$, projection operators $P_i$ satisfy $P_i\psi = c_i\psi_i$. This gives $\langle\psi|P_i|\psi\rangle = |c_i|^2 = p_i$ and $\sum_{i=1}^{n} p_i = 1$. Let us consider the state of $N$ copies of $\psi$, $\Psi = \bigotimes_{a=1}^{N} \psi$. The frequency operators for the eigenvalues $\lambda_i$ are

$$F_i = N^{-1} \sum_{j_1,\ldots,j_N=1}^{n} \sum_{a=1}^{N} \delta_{i j_a} \bigotimes_{b=1}^{N} P_{j_b}. \tag{11}$$

We find

$$\langle\Psi|F_i|\Psi\rangle = p_i, \tag{12}$$

$$\langle\Psi|F_i^2|\Psi\rangle = N^{-1} p_i + N^{-1}(N-1)\, p_i^2, \tag{13}$$

and the variances are $(\Delta F_i)^2 = N^{-1} p_i (1 - p_i)$. Consider $\psi' = \sum_{i=1}^{n} c_i'\psi_i$ close to $\psi$, and require

$$(p_i - p_i')^2 < \min\{(\Delta F_i)^2, (\Delta F_i')^2\}. \tag{14}$$

This gives

$$|c_i - c_i'|^2 < N^{-1}(1 - p_i), \tag{15}$$

which leads to

$$|\psi - \psi'|^2 < N^{-1}(n - 1). \tag{16}$$

This condition is satisfied if we require $|\psi - \psi'|^2 \ll \epsilon^2$, recalling that $N\epsilon^2 < 1$. It is natural to identify the two states $\psi$ and $\psi'$, and consider them both approximate eigenstates of the frequency operator.
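
The moments in Eqs. (12) and (13) can be verified by brute force for small $N$. Because the frequency operator is diagonal in the product eigenbasis, its expectation values reduce to weighted sums over outcome sequences. The sketch below relies on that observation; the function name and the sample probabilities are illustrative assumptions.

```python
from itertools import product

def frequency_moments(probs, N, i):
    """Return (<F_i>, <F_i^2>) for N copies of a state with outcome
    probabilities `probs`, by summing over all len(probs)**N sequences."""
    mean = 0.0
    second = 0.0
    for seq in product(range(len(probs)), repeat=N):
        weight = 1.0                  # squared amplitude of this sequence
        for s in seq:
            weight *= probs[s]
        freq = seq.count(i) / N       # eigenvalue of F_i on this sequence
        mean += weight * freq
        second += weight * freq ** 2
    return mean, second

probs, N, i = [0.7, 0.3], 6, 0
mean, second = frequency_moments(probs, N, i)
p = probs[i]
print(mean, p)                                   # Eq. (12)
print(second, p / N + (N - 1) * p ** 2 / N)      # Eq. (13)
print(second - mean ** 2, p * (1 - p) / N)       # variance
```

The enumeration grows as $n^N$, so this check is only practical for small $N$, but it confirms that the variance shrinks as $1/N$ while never reaching zero at finite $N$.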

CONCLUSIONS 

We argued that attempts to derive the Born rule, either 
in the Many Worlds or Copenhagen interpretation, are 
unsatisfactory for systems with only a finite number of 
degrees of freedom. For Many Worlds this is a serious 
problem, since its goal is to account for apparent collapse 
phenomena — including the Born rule for probabilities — 
assuming only unitary evolution of the wavefunction. For
a finite number of degrees of freedom, observers on the vast
majority of branches would not deduce the Born rule. 

However, we noted that discreteness of the quantum 
state space, even if extremely tiny, may restore the valid- 
ity of the usual arguments. Some may regard discreteness 
as a radical proposal. We might argue that it is actually 
less speculative than absolute continuity, something that 
can never be experimentally verified. 